ABSTRACT
This report presents the results of an analysis of potential
outliers and missing data points in 3-D data. Treatments of
isolated and multiple questionable observations (potential outliers
and/or missing data points) are suggested for inclusion in the
algorithm for smoothing 3-D data using a 7-point least-squares
method for fitting polynomials of order three or less.
I.     INTRODUCTION

The purpose of this report is to present the results of a
study of methods of treatment of potential outliers (wild data
values) and missing points for inclusion in an algorithm for
smoothing of data at NUWES. Potential outliers and missing data
points can contaminate both data smoothing (Ref. 1) and geometric
analysis of vehicular paths (Ref. 2).

Data used in this investigation were obtained for a single
trial run at NUWES. (This run was labeled Trial 2 by this
investigator.) Two vehicles (A and B) were involved in this
trial. Plots of the horizontal and vertical paths of the two
vehicles are shown in Figures 1a,b. Missing points are circled in
Figure 1b and denoted by M in data lists. Potential outliers
are boxed in figures and denoted by W in data lists.

Data at every eighth scheduled data collection time are
missing. In addition, there are other missing data times.
Temporary values for these were established as the average of the
adjacent values (Ref. 1). Potential outliers are identified by
the use of sequential differences (Ref. 3), with any fourth order
difference (Δ4) having a magnitude of 50 or greater being
considered a potential outlier. (The selection of the threshold
of 50 is somewhat arbitrary, as discussed in Reference 3.)
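The detection rule can be sketched in a few lines. This is a minimal illustration, not the routine of Reference 3: it assumes the fourth order difference is the centered five-point difference, and the names `fourth_differences` and `flag_potential_outliers` are hypothetical. The path used is synthetic.

```python
import numpy as np

def fourth_differences(values):
    """Centered fourth-order sequential differences (assumed definition).

    The two points at each end, where the five-point stencil does not
    fit, are left as NaN.
    """
    y = np.asarray(values, dtype=float)
    d4 = np.full(y.shape, np.nan)
    # Stencil: y[i-2] - 4*y[i-1] + 6*y[i] - 4*y[i+1] + y[i+2]
    d4[2:-2] = y[:-4] - 4 * y[1:-3] + 6 * y[2:-2] - 4 * y[3:-1] + y[4:]
    return d4

def flag_potential_outliers(values, threshold=50.0):
    """Indices whose fourth difference has magnitude >= threshold."""
    d4 = fourth_differences(values)
    with np.errstate(invalid="ignore"):
        mask = np.abs(d4) >= threshold  # NaN end points compare False
    return np.flatnonzero(mask)

# Synthetic straight-line path with one wild value at index 4.
path = np.arange(9, dtype=float) * 10.0
path[4] += 12.0
flagged = flag_potential_outliers(path)
```

A single wild value of size e produces fourth differences of e, -4e, 6e, -4e, e at the five surrounding times, which is why one bad observation contaminates the differences at adjacent times as well.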

Data  smoothing  in  this  study,  and  proposed  for  inclusion 
in  data  processing  at  NUWES,  uses  the  7-point  Least-Squares 
Polynomial  Regression  designed  for  7  consecutive  observations 
with  no  missing  data  (Ref.  1). 


A general discussion of the magnitude of the potential
outlier and missing point problem is presented in Section II.
Their treatment is discussed in Section III.
II.   GENERAL

The magnitude of the problem of potential outliers and
missing points can be demonstrated by the frequency of their
occurrence in Trial 2. Observational times for vehicle A were
from t = 2084 to t = 2379 and included 296 observational times.
These included 3b scheduled missing times (M) and 7 additional
unscheduled ones for a total of 4b missing data times. There
were also 43 potential outliers (W) in this path, with 8 of these
designated as both W and M. A summary of the occurrences of wild
and missing data is shown below.

Table 1a   Vehicle A
A similar examination of the path of vehicle B was
also made. Observational times were from t = 2069 to t = 2353,
giving 285 observational times. These included 3b scheduled
missing times and 43 unscheduled ones for a total of 3b. There
were 22 potential outliers in these data, none of which were also
missing data values. This is summarized in Table 1b.
Only a brief examination of the extent of wild and missing
data was made. Their causes are certainly of concern to data
collection personnel, but procedures for treatment are of concern
for data processing. Some general comments are presented:
(1)  There are about seven times as many unscheduled missing data
times for the path of vehicle B as there are for that of
vehicle A. A cursory examination suggests that these are
more prevalent in the path of vehicle B immediately following
its approach by vehicle A. Following two of the three
approaches in this trial, vehicle A was closer to the nearest
tracking array than vehicle B. (Between them? This may
be of interest to data collection. These segments of the
vehicular paths may be of lesser concern for data processing
because they may be of lesser interest to the personnel who
are the users of the smoothed data.)

(2)  There are about twice as many potential outliers in the
path of vehicle A as in that of vehicle B. That their
frequency is greater is not unexpected since vehicle B
was doing less maneuvering (ostensibly, on a straight-line
path). That 8 of the missing values in the path of vehicle
A are also designated as potential outliers should not be
unexpected. The temporary value inserted for missing values
using linear interpolation between adjacent values can be
expected to be inconsistent when the actual path is not
linear. Note that none of the missing values in the path of
vehicle B were also designated as potential outliers.


(3)  It is interesting to note the low rate of occurrence of
potential outliers in more than one coordinate at the same
observation times. For the path of vehicle B these
occurred only b times, and in only two were all three
coordinate values indicated as potential outliers. 2^ of
the 43 potential outliers occurred in one coordinate only.
One might be tempted to expect greater multiplicity, since any
discrepancy in data from the instrumentation arrays is
transformed to position coordinates and hence would be
expected to contaminate the values of all coordinates at
that observational time.


III.  TREATMENT OF POTENTIAL OUTLIERS AND MISSING POINTS
A.   General 

The procedure used in this study (and proposed for data
smoothing at NUWES) incorporates a 7-point Least-Squares (L-S)
polynomial computational routine to treat missing points and
potential outliers and, subsequently, for smoothing the rest
of the data. Since missing points and potential outliers
can contaminate the smoothing of other data points, they
should be treated first.

The combination of seven consecutive points for the
smoothing routine and the regular scheduling of missing
points (every eighth point) complicates the treatment.
Operation of chance would dictate that only one time out of
eight would a potential outlier or another random missing
point be centered in the seven-point segment between
successive scheduled missing points. A missing point or a
potential outlier centered in a seven-point segment with no
other missing points or potential outliers will be called
isolated. These are the easiest to treat. The presence of
two or more missing points and/or potential outliers in the
same seven-point data segment calls for more careful
treatment. As discussed in Reference 1, the presence of
three such points in a segment should be flagged to indicate
to potential users of the smoothed data that the data are of
reduced quality.

As discussed in Reference 1, isolated missing points
or potential outliers are treated by iterating the 7-point
L-S program, replacing the suspect value by the smoothed
value at each step and repeating until the smoothed value
has a residual error well within the noise of the remaining
values in the segment. Since the 'noise' in the NUWES
system has a standard deviation of 2 or less for good
quality data, the value of 1 has been selected as the
magnitude of the residual error for stopping the iteration.

The  treatment  of  multiple  missing  points  and/or 
potential  outliers  involves  the  same  procedure  with 
the  suspect  values  replaced  by  the  smoothed  values  at  each 
step  and  the  smoothing  continued  until  all  of  the  suspect 
values  have  residual  errors  within  the  specified  level  (1). 
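The iteration for an isolated suspect value can be sketched as follows. This is a minimal illustration rather than the TI59 program of Reference 1: it assumes that the polynomial order (1 to 3) is chosen by the smallest SDR and that the residual is the difference between the value entering an iteration and the smoothed value leaving it. The names `best_fit` and `smooth_center` are hypothetical. The data are the seven x values for t = 2133 to 2139 as read from Table 4b.

```python
import numpy as np

T = np.arange(-3, 4)  # segment times translated to -3..3

def best_fit(segment, max_order=3):
    """Fit orders 1..max_order; return (SDR, order, fitted values) of the best."""
    best = None
    for k in range(1, max_order + 1):
        coeffs = np.polyfit(T, segment, k)
        fitted = np.polyval(coeffs, T)
        resid = segment - fitted
        # Divisor n - (k + 1): number of points minus number of coefficients.
        sdr = float(np.sqrt(np.sum(resid ** 2) / (len(T) - (k + 1))))
        if best is None or sdr < best[0]:
            best = (sdr, k, fitted)
    return best

def smooth_center(segment, tol=1.0, max_iter=20):
    """Iteratively replace the center value until its residual is below tol."""
    seg = np.asarray(segment, dtype=float).copy()
    for iteration in range(1, max_iter + 1):
        sdr, order, fitted = best_fit(seg)
        residual = seg[3] - fitted[3]
        seg[3] = fitted[3]           # accept the newest smoothed value
        if abs(residual) < tol:
            return seg[3], iteration, order, sdr
    raise RuntimeError("center value did not settle within max_iter")

x = [33637.7, 33556.5, 33486.6, 33466.5, 33446.4, 33485.1, 33559.3]
value, iterations, order, sdr = smooth_center(x)
```

With these inputs the sketch stops after four iterations with a center value close to the 33,452.2 reported in Table 4b. Small differences from the tabulated intermediate values are expected because the sketch does not round between iterations.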

A few missing points and potential outliers in Trial 2
are used to illustrate the smoothing procedure. These are
presented in the next section. A 7-point L-S polynomial
program for the TI59 hand-held calculator (see Ref. 1) was
used in this treatment.


B.    Treatment  of  Isolated  Values 

1.    An  Isolated  Missing  Point 

The isolated missing value selected for
illustration of the treatment occurred at time t = 2128
in the x coordinate of vehicle A. Data in the
vicinity of this point are presented in Figure 2 and
Table 2. Also presented in Table 2a are the sequential
differences.

Three iterations of the 7-point L-S polynomial
smoothing were performed (see Table 2b, columns 2, 3,
and 4). The first iteration showed a residual error
of r = -3.7b. Replacing the temporary value
X = 33,810.9 by the smoothed value X = 33,814.7
and performing the second iteration showed the residual
error reduced to r = -1.23. Again, replacing by
X = 33,815.9 (the smoothed value in the second stage)
and iterating resulted in the smoothed value
X03 = 33,816.3 with the residual error reduced to
r = -0.42. Since this residual error is less than one
in magnitude, the iteration was stopped. Note that
the smoothed value X02 has a residual error within
the specified limits. The third iteration was
necessary to establish this. The residual error r03
will be even smaller. Since the third estimate X03
had to be computed in determining the value of r02, it is
accepted as the smoothed value for x at time t = 2128.
Figure 2   Missing Point (2AX)   t = 2128
Examination of Figure 2 and the first order successive
differences (Δ1) in Table 2a indicates that vehicle A was
undergoing a change in path in the vicinity of t = 2128.
Sequential differences were recalculated to determine if
this change might be indicated by a potential outlier,
possibly at t = 2129 or t = 2130. These values are also
presented in Table 2a. The fourth order sequential difference
at t = 2129 was increased in magnitude from 18.2 to 40.0 but
does not exceed the threshold of 50, so the change in path
was not detected by sequential differences.

Because of the change in path (maneuver) of vehicle A,
the effect of shifting the segment on the smoothed value was
explored. Segments with centers at t = 2126, 2127, 2129,
and 2130 were fitted. The smoothed values obtained are
presented in Table 2c together with the residual errors
(r0) at t = 2128 and the standard deviations (SDR) of the
residual errors for the segments. The computations are
presented in Table 2b.
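The segment-shifting comparison can be sketched as below. The observations behind Table 2c are not reproduced in this section, so the sketch uses a synthetic noise-free cubic path. The function name `fit_segment` and the rule of choosing the order with the smallest SDR are assumptions made for illustration.

```python
import numpy as np

def fit_segment(times, values, center, target, max_order=3):
    """Fit the 7-point segment centered at `center`; evaluate at `target`.

    Returns (smoothed value at target, SDR of the chosen fit).
    """
    times = np.asarray(times)
    values = np.asarray(values, dtype=float)
    i = int(np.flatnonzero(times == center)[0])
    t_seg = times[i - 3:i + 4] - center   # translate to -3..3
    y_seg = values[i - 3:i + 4]
    best = None
    for k in range(1, max_order + 1):
        coeffs = np.polyfit(t_seg, y_seg, k)
        resid = y_seg - np.polyval(coeffs, t_seg)
        sdr = float(np.sqrt(np.sum(resid ** 2) / (7 - (k + 1))))
        if best is None or sdr < best[0]:
            best = (sdr, float(np.polyval(coeffs, target - center)))
    return best[1], best[0]

# Synthetic path: an exact cubic in (t - 106), so every segment agrees.
times = np.arange(100, 113)
u = times - 106
values = 0.5 * u ** 3 - u ** 2 + 3.0 * u + 10.0

results = {c: fit_segment(times, values, c, target=106) for c in range(104, 109)}
```

On an exact cubic every shifted segment returns the same value at the target time with an SDR near zero. A spread in the smoothed values such as that reported in Table 2c therefore points to noise or to a path that a cubic cannot follow across the segment.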

Table 2c   Varying Segments for Smoothing

Segment      Smoothed X       Residual       Std. Dev.
Center       at t = 2128      Error (r)      (SDR)

2126         33810.3           0.57          1.57
2127         33811.7          -0.82          1.06
2128 (M)     33816.3          -0.43          3.55
2129         33813.3          -0.3H          3.26
2130         33817.5          -0.81          5.71
There are several features in these tables that are worthy
of comment, as follows:

(a)  All of the smoothing applications involved a third
order polynomial, since the standard deviation SDR
of the residual errors was smaller for the cubic
(SDR3) than for the linear (SDR1) or the quadratic
(SDR2). The cubic polynomial used to fit the
data segment with center at t = 2126 is of the
form


since the coefficient (b3) of the third order term
is positive (b3 > 0). For all other segment
centers the cubic is of the form


since the coefficient b3 is negative. These
results suggest that the data segments centered at
t = 2127 to t = 2130 included positions in the
maneuver.


(b)  The smoothed values for x at time t = 2128 vary
more than 7 units depending upon the data segment
used for the smoothing. The question now arises
of which smoothed value provides the best estimate
of the x coordinate of vehicle A at time t = 2128.
The residual error at t = 2128 provides no help
here since it could be reduced to zero by
repeated iteration.

Note that, as discussed in the smoothing of
the segment centered at t = 2128, the residual
error is the difference between the temporary
value before the last iteration and the smoothed
value after that iteration and hence does not
represent an error in the smoothed value. It
should be noted also that further iteration to
reduce the residual error at t = 2128 will only
produce small reductions in the standard
deviations of the residual errors of the segments,
since the purpose of the iterations is to reduce
the residual error at that point to a value well
within the residual errors at the other points in
the same data segments (i.e., a small contribution
with respect to the 'noise' in the segments).

(c)  Of greater use for selecting the most appropriate
data segment, and consequently the most
appropriate estimate for x at t = 2128, are the
values of the standard deviation of the residual
errors (SDR, in Table 2c). The standard
deviation of the residual errors is used in
establishing confidence intervals for the actual
value of the dependent variable (x), with smaller
values producing narrower confidence intervals
(Ref. 1). The data segment centered at t = 2126
had the smallest standard deviation and hence
could be considered to give the preferred
estimate.

The variation of the width of the confidence
interval with the degree of polynomial used to fit
the data segment and with the location of the
missing data point within the segment has not been
fully explored. The first degree polynomial was
treated in Reference 1, but similar expressions
for confidence intervals when second and third
order polynomials are used need further
development.
(d)  There should be some concern about the effect of
the change in vehicular path on the smoothing of
the data. This change occurred in the vicinity of
times t = 2128 or 2129. (See Fig. 2.)

Note that one possible explanation for the
increase in the value of SDR as the center of the
data segment is shifted is an increase in the
'noise' level in the observations. Another is the
inability of the third order polynomial to
represent the actual vehicular path adequately.
In order to avoid the latter possibility it would
appear desirable to avoid smoothing data with
segments including rapid changes in vehicular
paths. Referring again to Figure 2, it can be
seen that a rather abrupt change in the vehicular
path is apparent at time t = 2130 but that the
observation at time t = 2129 appears to be
consistent with the preceding values. Thus
exclusion of the observation at t = 2130 from the
segment would lead to using the 7-point data
segment centered at t = 2126. Further, this same
segment should also be used for subsequent
smoothing of data values at times 2127 and 2129
instead of using data segments centered at those
times. (Note that this suggestion of using the
data segment centered at t = 2126 to smooth the
value for the missing point at t = 2128 is in
accord with the discussion in comment (c) above.)
(e)  The major guidelines in developing a data smoothing
algorithm for use at NUWES included:

(1)  the resulting data smoothing program should
be as fully automated as possible, and

(2)  the resulting data smoothing program should
be as simple and short as possible.

These two guidelines are contradictory when it
comes to treatment of changes in vehicular paths.
It could be very awkward to construct subroutines
to implement automatic identification of the times
of changes in vehicular paths. On the other hand,
manual screening of the data to identify such
times would reduce the level of automation.

Fortunately there is another source that could
be made available to provide this information.
This is the internal control
information collected from the vehicles. It is
strongly recommended that this source of information
be explored with the intent of including it
with the data to be smoothed.
2.    An Isolated Potential Outlier

As discussed in Section IIIA, isolated potential
outliers are rare. One occurrence in the trial used in
this report was the y coordinate of vehicle A at time
t = 2268. The data in the vicinity of the potential
outlier are presented in Table 3a together with the
sequential differences (Δ4). At t = 2268, Δ4 = 75.1 in
magnitude, which exceeds the selected threshold of 50,
and hence the y value at t = 2268 is indicated as a
potential outlier. A plot of the data is also
presented in Figure 3.

Treatment of a potential outlier is the same as
that for an isolated missing point. Four iterations
were required to ensure that the smoothed value to be
used as a replacement for the potential outlier was
consistent with the other six values in the data
Figure 3   Isolated Potential Outlier   2AY   t = 2268
Table 3a
SEQUENTIAL DIFFERENCES FOR AN ISOLATED POTENTIAL OUTLIER


1 
1 

betore  Treatment 

After  Treatment 

t 

y 

A4 

A4 

22b  J 

-bb27 . 6 

2264M 

-b7U4.2 

22b3 

-Ub73l .6 

la.i 

2266 

-bfclbtt.ci 

-y.i 

b  .  3 

22b7 

-b9b7 . b 

jb  .  1 

-25.  5 

22bb 



-7U4b .a 

-  7  b  .  i  'a 

1  .9 

22b9 

-7107.5 

42. b 

11  .8 

<i27U 

-7173.5 

-3  .2 

12  .  2 

2271 

_  _ 
-7244 .1 

5.6 

22  7  2M 

-7319. a 

227i 

-7395.5 

Treatment:   observed value y = -7048.8 at t = 2268 replaced
by smoothed value y = -7033.4
segment, i.e., that the residual error of the smoothed
value was less than the specified magnitude of one.
The fourth iteration was required to determine whether
the smoothed value obtained in the third iteration
satisfied this criterion. As in the treatment of an
isolated missing point (Section III B1), the smoothed
value x4 = -7033.4 established in the fourth iteration
was selected as the replacement for the observed value
x1 = -7048.8 and will have a residual error about the
fitted curve which is less than r = 0.39. The iterations
conducted on a TI59 are presented in Table 3b.

There are several features of this treatment which
are worthy of comment.

(a)  Sequential differences were recalculated after
replacing the potential outlier. These are presented
in the right hand part of Table 3a. The
fourth order difference at t = 2268 has been reduced
in magnitude from 75.1 to 1.9, and elimination
of the contamination of the fourth order
differences at the adjacent times has also reduced
their magnitude.

(b)  It is of some interest to note that in the
first two iterations a second order polynomial
(parabola) provided the best fit (smallest SDR)
but a third order polynomial (cubic) gave a
slightly better fit in the last two iterations.
Table 3b   Isolated Potential Outlier (2AY)


Smoothing  Iteration 

yi 

yK2) 

y  i  (  3  ) 

Yl(4) 

2265 

22bb 

22b7 

22bb  W 

22b9 

227U 

2371 

-b7bl  .8 
-bb66.b 
-6937. b 
-7U4b.8 
-71U7.D 
-7173.5 
-7/44  .  1 

-7038.4 

-7034  .9 

-7033.8 

SDK  1 
SDR  2 

SDK  3 
K 

b3 

bZ 

bl 

bo 

14  .157 
b.947 
7.819 
2 

3  .  103b 
-7b  .b33b 
-12.4143 

11  .141 
3.504 
J  ,b27 
2 

2  .  b083 
-7b . 033b 
-10  .4333 

1  0  .  3  3  j. 

2  .877 

z.797 

3 

-U.2111 

2.4417 

-75.175b 

y . 7bo7 

10  .  109 

Z  .804 

z.by7 

3 

-0.  llll 

2  .  3by3 

-75.  173d 

-9.5571 

y-3 
y-z 

y-l 

yo 

y[ 

y2 
y3 

-b73U  .  5 
-6872.7 
-b93b.7 
-70 3b  .  4 
-7112.0 
-7179.3 
-7240.4 

-6781  .5 
-ob71  .2 
-b955.7 
-7054.9 
-7108  .0 
-7177.8 
-7241  .4 

-b730  .6 
-b872  .0 
-695b  .0 

-7035.8 
-710b. 7 
-717b. 1 
-7248.0 

-b7d0 . 7 
-6871 .8 

-by  55 . b 
-7085  .4 
-7106.4 
-7175.9 
-7243 .1 

r-3 
r-l 

ru 

rl 
r2 

r3 

-I  .28 
3.89 
1  .lb 
-10.39 
4  .  4b 

-3.  b6 

-0  .  29 
2.41 

-1.81 

-J  .43 
1  .49 
4.32 

-2.  o7 

-1.22 

3.17 

-1.33 

-1.12 

-0.77 

Z   .  33 

-1  .07 

-1.12 

3  .02 
-1.8b 

0  .  jy 

-1.08 

Z  .  40 

-0.96 

(c)  The reduction in the residual error in the
potential outlier and in the standard deviation
(SDR) of the residual errors of the data segment
are worthy of notice. The residual error was
r0 = -10.39 for the potential outlier, and the
standard deviation of the residual errors (the
differences between observational values and smoothed
values) was SDR2 = 6.95. The third iteration of
smoothing replaced the potential outlier value of
x = -7048.8 by x3 = -7033.8, which has a residual
error of r = -0.4, and the standard deviation of
the residual errors was SDR3 = 2.70 (established
in the fourth iteration).

(d)  The magnitudes of all of the residual errors when a
data segment is smoothed are of some concern. This
is represented by the value of the SDR, which was
somewhat larger in all but one of the data
segments examined in the previous subsection
(III B1), where an isolated missing point was
considered. There are always some reservations in
the mind of this investigator (and should be in
the mind of any potential user of the smoothed
data) whether a larger value of the SDR is caused
by inadequacy of the model (polynomials of order
three or lower) or an increase in the level of
noise in the data.
Inadequacy of the model is not limited to major changes
in a vehicular path, as apparently occurred in the missing point
example, but could be produced by fish-tailing (snake action) for
vehicular control or minor corrections in attack path. A higher
data rate would improve the smoothing capabilities for following
such higher frequency path variations by allowing use of longer
path segments and/or higher order polynomials, and would improve
smoothing capabilities even when such path anomalies were not
present.

The presence of an unscheduled missing point or of an
outlier when the SDR for the residual error is large should not
be unexpected. It should serve as an indication that the
position location system is having difficulty in obtaining
consistent data on the vehicular path.

The inability of the smoothing procedure to distinguish
between inadequacy of model and noise as the cause for larger
values of the SDR should be recognized as a different kind of
inaccuracy of the model. In the development of the Least-Squares
Model it was assumed that the noise components of the observed
values were independent. Any persistence in the noise component
is thus treated as a portion of the actual path component. As an
extreme example, any constant portion of the noise component that
persists over an entire data segment will result in a bias in the
smoothed path, i.e., in an offset of the smoothed path from the
actual path of the vehicle.
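The extreme case of a constant noise component can be checked numerically. The sketch below is illustrative only: the path, the offset of 4 units, and the name `smoothed_center` are invented for the example, and a plain cubic least-squares fit stands in for the report's routine.

```python
import numpy as np

T = np.arange(-3, 4)  # 7-point segment, times translated to -3..3

def smoothed_center(segment, order=3):
    """Least-squares polynomial fit evaluated at the segment center."""
    return float(np.polyval(np.polyfit(T, segment, order), 0.0))

true_path = 5.0 + 2.0 * T - 0.3 * T ** 2   # hypothetical noise-free path
bias = 4.0                                  # constant error over the segment

clean = smoothed_center(true_path)
biased = smoothed_center(true_path + bias)
offset = biased - clean
```

Because the fit is linear in the observations and a constant is itself a polynomial, the whole offset passes through the smoothing unchanged: `offset` equals `bias`, and the residuals give no hint that it is present.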
3.   An Isolated Missing Point/Potential Outlier

The fact that the temporary replacement of an
isolated missing point by the average of the values at
the adjacent points can produce a value which is
identified by the sequential differences as a potential
outlier is illustrated by the x-coordinate of
vehicle A at time t = 2136. The data segment and
the sequential differences are presented in Table 4a
and sketched in Figure 4. The four smoothing
iterations are shown in Table 4b.

The treatment here is not different from that of
an isolated missing point or a potential outlier. It
is included in this report to illustrate that the
temporary replacement of a missing point by the average
of the adjacent points is actually using a 2-point
straight line fit and hence may be substantially
different from the actual value of the component when
the vehicle is not traveling in a straight line.
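The numbers for this example can be reproduced directly. The sketch below assumes the centered five-point form of the fourth order difference and uses the x values for t = 2134 to 2138 as read from Table 4b; the variable names are illustrative.

```python
# Observed neighbors of the missing point at t = 2136 (from Table 4b).
x_2134, x_2135 = 33556.5, 33486.6
x_2137, x_2138 = 33446.4, 33485.1

# Temporary value: average of the adjacent points (a 2-point straight line).
temporary = 0.5 * (x_2135 + x_2137)

# Fourth order difference centered on the temporary value (assumed form).
d4 = x_2134 - 4 * x_2135 + 6 * temporary - 4 * x_2137 + x_2138

# Smoothed value accepted after four iterations, as reported in Table 4b.
smoothed = 33452.2
shortfall = temporary - smoothed
```

The temporary value is 33,466.5, and its fourth difference of about 108.6 is well above the threshold of 50 (Table 4a lists 108.7). The straight-line estimate lies about 14.3 units from the value finally accepted, which is the curvature that the 2-point fit cannot represent.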

One other side comment that may be of interest concerns
the magnitudes of the SDRs in the second and third
smoothing iterations in comparison with the actual
values of the residual errors. The SDR of the residual
errors is larger than any of the residual errors in
these iterations. This is a consequence of using

$$ \mathrm{SDR} = \left[ \frac{1}{n-4} \sum_i r_i^2 \right]^{1/2} $$

instead of the root-mean-square


Figure 4   ISOLATED MISSING POINT/POTENTIAL OUTLIER
Table 4a
ISOLATED MISSING POINT/POTENTIAL OUTLIER    2AX


1 

Before  Treatment 

After  Treatment 

t 

X 

x*  =  d>l ,  4D2  .  2 

A4 

A4 

2131 

33  ,794.2 

2132 

33,726.1 

12  .8 

2133 

33,637.7 

-23.4 

2134 

33 , 536  .  5 

34.3 

20.1 

213d 

33 ,486.  b 

-as.  2 

-31.6 

2136M 

33 ,466  .5 

108.7 

23.2 

2137 

J>  J  ,  44b  .  3 

-82.  d 

-25.5 

— 

-- 

-- 

2i8« 

^^ ,433  .  1 

4.8 

-9.4 

-- 

-- 

-- 

2139 

l 

jj  ,dd9  .  3 

-0.6 

-- 

-5.9 

214U 

J3  ,  650  .  2 

2141 

i 

33,738  .5 

$$ \mathrm{RMS} = \left[ \frac{1}{n} \sum_i r_i^2 \right]^{1/2} $$

when a cubic polynomial is fitted to a data segment of
n = 7 observations. The unbiased estimator SDR of the
standard deviation for the noise component is considerably
increased over the RMS value because the data
segment is so short. An increase in the data rate to
increase n is desirable. Note that a kth order
polynomial would require a divisor of n-(k+1) since it
would involve k + 1 coefficients. Thus a substantial
increase in n (e.g., doubling the data rate) would
permit some increase in the order of polynomials considered
for fitting the data segment without making the
value of SDR unrepresentative of the residual errors.
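The effect of the divisor is easy to check. The sketch below uses the seven residuals listed for the fourth smoothing iteration in Table 4b; the variable names are illustrative.

```python
import math

# Residuals r(-3) .. r(3) from the fourth iteration in Table 4b.
residuals = [-1.20, 3.36, -2.43, 0.32, -1.77, 2.81, -1.07]

n = len(residuals)   # 7 observations in the segment
k = 3                # cubic polynomial, so k + 1 = 4 coefficients

sum_sq = sum(r * r for r in residuals)
rms = math.sqrt(sum_sq / n)               # divisor n
sdr = math.sqrt(sum_sq / (n - (k + 1)))   # divisor n - (k + 1) = 3
ratio = sdr / rms                         # equals sqrt(7 / 3)
```

This gives an SDR of about 3.21, in agreement with the SDR 3 entry for that iteration, against an RMS of about 2.10. With only three degrees of freedom the SDR is inflated by a factor of sqrt(7/3), roughly 1.53, which is why it can exceed every individual residual in some iterations.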
Table 4b   Isolated Missing Point/Potential Outlier (2AX)

Smoothing
Iteration
t_i          x_i           x'_i(2)       x'_i(3)       x'_i(4)

2133         33,637.7
2134         33,556.5
2135         33,486.6
2136 M,W     33,466.5      33,456.8      33,453.6      33,452.5
2137         33,446.4
2138         33,485.1
2139         33,559.3

SDR 1        64.974        66.665        67.26b        67.477
SDR 2         9.428         7.599         7.372         7.346
SDR 3         7.545         3.922         3.294         3.21b
k             3             3             3             3
b3            0.925         0.9250        0.925         0.925
b2           15.7179       16.1798       16.3321       16.3845
b1          -21.4143      -21.4143      -21.4143      -21.4143
b0          -62.8714      -64.7190      -65.328b      -65.5381

x'(-3)       33,637.6      33,638.5      33,638.8      33,638.9
x'(-2)       33,555.1      33,553.8      33,553.3      33,553.1
x'(-1)       33,493.1      33,490.3      33,489.4      33,489.1
x'(0)        33,456.8      33,453.6      33,452.5      33,452.2
x'(1)        33,452.1      33,449.3      33,448.4      33,448.1
x'(2)        33,484.3      33,482.9      33,482.4      33,482.3
x'(3)        33,559.0      33,560.0      33,560.3      33,560.4

r(-3)         0.13         -0.80         -1.10         -1.20
r(-2)         1.36          2.74          3.20          3.36
r(-1)        -6.43         -3.68         -2.76         -2.43
r(0)          9.bo          3.19          1.06          0.32
r(1)         -5.77         -3.00         -2.09         -1.77
r(2)          0.81          2.20          2.66          2.81
r(3)          0.26         -0.66         -0.9b         -1.07
C.    Treatment of Multiple Values
1.   General Considerations

When more than one missing point and/or potential
outlier occur in the same 7-point data segment, the
selection of the appropriate treatment is more difficult.
Treatment of data segments containing 3 or more
values which are either missing points or potential
outliers requires additional considerations and will be
postponed until the next section (Sect. D). Only
occurrences of two such values will be examined here.

Treatment of two such suspect values must take
into consideration the differences in the nature of the
suspect values as well as their location in a 7-point
segment. There are three possible procedures:
(a)  Smooth first one using iterations as necessary,
then the other using the smoothed value for the first.
It would appear advisable to resmooth the first again
after the second is smoothed. The question arising here
is which value should be smoothed first. In the case of
two potential outliers it would appear reasonable to
smooth first the one with the largest fourth order
difference (Δ4) as representing the greater potential
contaminator. In the case of a potential outlier and a
missing point it would appear reasonable to smooth the
potential outlier first for the same reason. In the
case of two missing points this reason is not pertinent
and a reasonable procedure would be to smooth the one
that occurred first in time for computational simplicity.
(b)  Alternate smoothing iterations centered first on one
time, then on the other, continuing the iterations until the
residual errors of both are within the prescribed limits.
This procedure requires more computational effort since
the 7-point segments shift between each iteration. There
is also the possibility that, because different data
segments are involved, both residual errors cannot be
reduced to the prescribed level simultaneously.

(c)  Simultaneous smoothing of the two values using a
single 7-point segment. The question here is where the
segment should be centered. This selection should take
into consideration the quality of the resulting smoothed
values.

As discussed in Reference 1, the quality of a
smoothed value can be expressed in terms of the width of
the confidence interval for the actual value at any time.
This confidence interval is of the form

$$ \mathrm{CI}(x_t) = \hat{x}(t) \pm t_{\alpha}\, s \sqrt{\frac{1}{n} + \frac{(t - \bar{t})^2}{\sum_i (t_i - \bar{t})^2}} = \hat{x}(t) \pm t_{\alpha}\, s \sqrt{\frac{1}{7} + \frac{t^2}{28}} \qquad (1) $$

when the t values are translated to t = -3, -2, -1, 0, 1,
2, 3 for the 7-point segment and the fitting polynomial
is linear. (The comparable forms for quadratic and cubic
polynomials have not been explored.) Thus the values to
be smoothed should be as close to the center of the
segment as possible since the confidence interval will be
shortest when t = 0.
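
The effect of position within the segment can be tabulated directly from Equation (1). The following is a minimal sketch, assuming the linear-fit form of the factor, 1/n + t^2 / sum(t_i^2), on a centered segment; the function name `width_factor` is hypothetical and is not part of the smoothing program described in this report.

```python
from fractions import Fraction


def width_factor(t_prime, n=7):
    """Return 1/n + t'^2 / sum(t_i'^2) for a linear fit on a centered n-point segment."""
    half = n // 2
    # Sum of squared translated times; equals 28 for t' = -3..3.
    sum_sq = sum(k * k for k in range(-half, half + 1))
    return Fraction(1, n) + Fraction(t_prime * t_prime, sum_sq)


# The interval half-width is proportional to the square root of the factor.
for t_prime in range(0, 4):
    factor = width_factor(t_prime)
    print(t_prime, factor, round(float(factor) ** 0.5, 3))
```

The factor grows from 1/7 at the segment center to 13/28 at the end points, which is the basis for preferring to keep suspect values near the center.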

Situations in which adjacent points are both
potential outliers are unlikely to occur with the
identification procedure specified (Ref. 3) since only
the point having the largest fourth order difference (Δ4)
exceeding the prescribed level (50) has been so labeled.
(To guard against outliers close to each other,
sequential differences should be recalculated whenever a
potential outlier has been smoothed.) This is
illustrated in Section III C 2.

Situations in which adjacent points consist of a
potential outlier and a missing point should be treated
simultaneously using the data segment centered on the
potential outlier, since the outlier contaminates the temporary
value assigned for the missing point. This is illustrated
in Section III C 3.

For situations with adjacent missing points,
simultaneous smoothing is again recommended. It is,
however, ambiguous as to which one should be used as the
center of the data segment used for smoothing. This is
examined in Section III C 4 for one such occurrence in
the trial run.
When a missing point occurs in the 7-point data segment
centered at a potential outlier but is not adjacent to
it, there is some question as to whether it should be
smoothed simultaneously with, or subsequent to, the treatment
of the potential outlier. This has not been examined
but, on the principle of making the associated confidence
interval as short as possible, the latter would appear
preferable.

For two missing points in the same 7-point data
segment which are not adjacent, the treatment can be
different depending on their separation. If they are
separated by only one point, simultaneous smoothing
using that point as the center of the data
segment would be advantageous from a computational
viewpoint and would not cause a substantial increase in
the width of the confidence interval. This can be seen
in the factor

$$ \frac{1}{n} + \frac{t^2}{\sum_i t_i^2} = \frac{1}{7} + \frac{1}{28} $$

for t = ±1 in Equation (1). This situation is
examined in Section III C 5. The situation when two
missing points are separated by two other points is
explored in Section III C 6.
III.  C 2    Two Potential Outliers

As discussed in Reference 3, a large fourth order
sequential difference (Δ4) indicating a potential outlier is
typically accompanied by large Δ4's for the adjacent values but
with opposite signs. These may also exceed the specified
threshold but should not initially be labeled as potential
outliers. This is illustrated in Table 5a and Figure 5.
Note that the Δ4's at times 2212, 2213, and 2214 all exceed the
threshold of 50, that their signs alternate, and that the magnitude
of Δ4 at 2213 is largest. Only the value of z at t = 2213
should be considered a potential outlier. Smoothing this value
(Table 5b) and recalculating the Δ4's verifies that the values
at t = 2212 and 2214 are not potential outliers and that their
Δ4's were contaminated by the designated outlier at t = 2213.
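
The labeling rule described above can be sketched in a few lines. This is an illustrative sketch only: it assumes the central form of the fourth order difference (which reproduces the Δ4 values tabulated in Table 5a) and assumes that, within a run of consecutive exceedances, only the time with the largest magnitude is labeled. The function names are hypothetical and do not come from Reference 3.

```python
def fourth_differences(values):
    """Central fourth-order differences; None where the 5-point stencil does not fit."""
    n = len(values)
    out = [None] * n
    for i in range(2, n - 2):
        out[i] = (values[i - 2] - 4 * values[i - 1] + 6 * values[i]
                  - 4 * values[i + 1] + values[i + 2])
    return out


def flag_potential_outliers(times, values, threshold=50.0):
    """Label one time per run of consecutive |D4| >= threshold: the largest magnitude."""
    d4 = fourth_differences(values)
    flagged = []
    i = 0
    while i < len(values):
        if d4[i] is not None and abs(d4[i]) >= threshold:
            j = i
            while (j + 1 < len(values) and d4[j + 1] is not None
                   and abs(d4[j + 1]) >= threshold):
                j += 1
            best = max(range(i, j + 1), key=lambda k: abs(d4[k]))
            flagged.append(times[best])
            i = j + 1
        else:
            i += 1
    return flagged, d4


# z values for t = 2209..2217 as listed in Table 5a.
times = list(range(2209, 2218))
z = [-393.3, -379.1, -386.5, -394.8, -377.8, -407.5, -411.0, -404.2, -405.6]
flagged, d4 = flag_potential_outliers(times, z)
print(flagged)
```

With these data the differences at 2212, 2213, and 2214 all exceed the threshold, but only 2213 is labeled, in keeping with the rule stated in the text.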

If the second calculation indicates another potential
outlier in the vicinity of the first one, then the suggested
procedure for smoothing can depend on their separation. The data
from Trial Run #2 was not examined to see whether this occurred.
Treatment for such a situation is the same as that for two missing
points and will be presented in Section III C 4.
Table 5a
Potential Outlier at t = 2213 (2BZ)

  t'_i    t_i         z_i        Δ4      Δ4* [z = -402.3]

   -4    2209 M     -393.3
   -3    2210       -379.1      42.3
   -2    2211       -386.5       5.5        -19.0
   -1    2212       -394.8     -98.2         -2.2
    0    2213 W     -377.8     144.9         -2.1
    1    2214       -407.5     -88.8         -9.2
    2    2215       -411.0      -2.6        -27.1
    3    2216       -404.2      26.7
    4    2217       -405.6   (illegible)


Figure 5    Potential Outlier at t = 2213 (2BZ)
Table 5b    Smoothing Potential Outlier in 2BZ at t = 2213

    t        t'       z_i      z_i(2)     z_i(3)     z_i(4)

   2210      -3     -379.1
   2211      -2     -386.5
   2212      -1     -394.8
   2213 W     0     -377.8     -394.4     -400.0     -401.9
   2214      +1     -407.5
   2215      +2     -411.0
   2216      +3     -404.2

   SDR1              9.435      5.093      5.096      5.332
   SDR2             10.548      4.320      2.854      2.636
   SDR3             11.842      4.093      1.651      1.065

   R                     1          3          3          3
   b3                   --     .33611     .33611     .33611
   b2                   --     .80952     1.0762     1.1667
   b1              -4.8929    -7.2456    -7.2456    -7.2456
   b0              -394.41    -3.2381    -4.3048    -4.6667

   z'(-3)           -379.7     -380.1     -379.5     -379.4
   z'(-2)           -384.6     -385.0     -385.8     -386.1
   z'(-1)           -389.5     -392.3     -393.9     -394.4
   z'(0)            -394.4     -400.0     -401.9     -402.5
   z'(+1)           -399.3     -406.1     -407.7     -408.3
   z'(+2)           -404.2     -408.6     -409.4     -409.7
   z'(+3)           -409.1     -405.4     -404.9     -404.7

   r(-3)               .64        .98        .44        .26
   r(-2)             -1.87      -1.52       -.72       -.45
   r(-1)             -5.28      -2.50       -.90       -.35
   r(0)              16.61       5.62       1.89        .62
   r(+1)             -8.19      -1.38        .22        .77
   r(+2)             -6.80      -2.41      -1.61      -1.34
   r(+3)              4.89       1.20        .67        .49
III.  C 3   A Potential Outlier and a Missing Point

When a missing point is adjacent to a potential outlier,
its temporary value is the average of the potential outlier and
the neighboring value on its other side. It would appear
reasonable in this situation to smooth the two values
simultaneously using the data segment centered on the potential
outlier. An example of this occurred at times 2175 (W) and 2176 (M)
in the x coordinate of vehicle A in Trial 2. The
appropriate data is presented in Table 6a and Figure 6. The
TI 59 calculator output is shown in Table 6b. Sequential
differences were recalculated since a potential outlier was
present, and the Δ4*'s are also presented in Table 6a.
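
The simultaneous iterated smoothing used for this example can be sketched as follows. This is a hedged illustration, not the TI 59 program itself: it assumes that polynomials of order 1 to 3 are fitted by least squares, that the order with the smallest standard deviation of residuals (computed with n - k - 1 degrees of freedom) is retained, that suspect values are replaced by fitted values rounded to one decimal, and that iteration stops when every suspect residual is below unity. All function names are hypothetical. The data are the x values for t = 2172 to 2178 from Table 6a.

```python
def polyfit(ts, xs, order):
    """Least-squares coefficients [b0, b1, ...] from the normal equations."""
    m = order + 1
    a = [[sum(t ** (r + c) for t in ts) for c in range(m)]
         + [sum(x * t ** r for t, x in zip(ts, xs))] for r in range(m)]
    for col in range(m):  # Gaussian elimination with partial pivoting
        piv = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(col + 1, m):
            f = a[r][col] / a[col][col]
            for c in range(col, m + 1):
                a[r][c] -= f * a[col][c]
    coef = [0.0] * m
    for r in range(m - 1, -1, -1):  # back substitution
        tail = sum(a[r][c] * coef[c] for c in range(r + 1, m))
        coef[r] = (a[r][m] - tail) / a[r][r]
    return coef


def evaluate(coef, t):
    return sum(b * t ** k for k, b in enumerate(coef))


def best_fit(ts, xs, max_order=3):
    """Fit orders 1..max_order; keep the one with the smallest SDR (assumed rule)."""
    best = None
    for order in range(1, max_order + 1):
        coef = polyfit(ts, xs, order)
        ss = sum((x - evaluate(coef, t)) ** 2 for t, x in zip(ts, xs))
        sdr = (ss / (len(ts) - order - 1)) ** 0.5
        if best is None or sdr < best[0]:
            best = (sdr, order, coef)
    return best


def smooth_suspects(ts, xs, suspects, limit=1.0, max_iter=20):
    """Replace suspect values by fitted values until all their residuals are < limit."""
    xs = list(xs)
    for iteration in range(1, max_iter + 1):
        _, _, coef = best_fit(ts, xs)
        residuals = []
        for t in suspects:
            i = ts.index(t)
            fitted = evaluate(coef, t)
            residuals.append(xs[i] - fitted)
            xs[i] = round(fitted, 1)
        if all(abs(r) < limit for r in residuals):
            return xs, iteration
    return xs, max_iter


ts = [-3, -2, -1, 0, 1, 2, 3]
xs = [34341.7, 34395.5, 34436.6, 34472.3, 34473.8, 34475.2, 34465.3]
smoothed, n_iter = smooth_suspects(ts, xs, suspects=[0, 1])
print(n_iter, smoothed[3], smoothed[4])
```

Under these assumptions the sketch stops after three fits with replacement values close to the x0 = 34,463.4 and x1 = 34,477.1 quoted in the heading of Table 6a.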

If a missing point occurs in the data segment centered
at a potential outlier but is not adjacent to it, then
simultaneous smoothing may not be appropriate. Note that the
factor

$$ \frac{1}{7} + \frac{t^2}{28} $$

in determining the width of the confidence interval is

$$ \frac{1}{7} + \frac{4}{28} = \frac{2}{7} \ \text{for } t = \pm 2 \quad \text{and} \quad \frac{1}{7} + \frac{9}{28} = \frac{13}{28} \ \text{for } t = \pm 3. $$

Thus the width of the confidence interval at these times is
substantially increased. It would appear reasonable in such
situations to smooth the potential outlier first, then smooth the
missing point, and, if desired, to resmooth the potential outlier.
Table 6a  Adjacent Potential Outlier and Missing Point

                                                Δ4*
   t'      t          x_i         Δ4      (x0 = 34,463.4)
                                          (x1 = 34,477.1)

   -3     2172      34,341.7     -30.8
   -2     2173      34,395.5      17.7        -22.0
   -1     2174      34,436.6     -36.1          2.8
    0     2175 W    34,472.3      63.1         -3.7
   +1     2176 M    34,473.8     -45.6         10.1
   +2     2177      34,475.2       1.8        -20.5
   +3     2178      34,465.3      21.4         24.7
   +4     2179      34,434.5      -9.6

Figure 6  Adjacent Potential Outlier and Missing Point


Table 6b   Smoothing Adjacent Potential Outlier and Missing Point

    t        t'       x_i        x_i(2)      x_i(3)

   2172      -3     34,341.7
   2173      -2     34,395.5
   2174      -1     34,436.6
   2175 W     0     34,472.3    34,465.1    34,464.0
   2176 M    +1     34,473.8    34,478.4    34,477.6
   2177      +2     34,475.2
   2178      +3     34,465.3
   2179      +4     34,434.5

   SDR1               28.868      27.867      27.525
   SDR2                4.665       1.546       1.289
   SDR3                5.150       1.715       1.323

   R                       2           2           2
   b3                     --          --          --
   b2                -6.9690     -6.7905     -6.7095
   b1                20.2643     20.4286     20.4
   b0                27.8762     27.1619     26.8381

   x'(-3)           34,341.6    34,341.6    34,341.8
   x'(-2)           34,396.7    34,396.0    34,395.8
   x'(-1)           34,437.8    34,436.8    34,436.3
   x'(0)            34,465.1    34,464.0    34,463.4
   x'(+1)           34,478.4    34,477.6    34,477.1
   x'(+2)           34,477.7    34,477.7    34,477.4
   x'(+3)           34,463.1    34,464.2    34,464.2

   r(-3)                0.14        0.11       -0.11
   r(-2)               -1.17       -0.47       -0.26
   r(-1)               -1.24       -0.17        0.31
   r(0)                 7.22        1.11        0.60
   r(+1)               -4.57        0.77        0.51
   r(+2)               -2.53       -2.49       -2.16
   r(+3)                2.15        1.14        1.09

III.  C 4    Two Missing Points

Some exploration will be presented here of the effects
of different treatments of two suspect values when they are
adjacent, separated by a single value, or separated by two values.
First, consider a situation with two adjacent missing points with
no other suspect values in the 7-point data segment centered on
either of them. It would appear reasonable to use simultaneous
smoothing with the data segment centered on either one. An
example of this in Trial 2 data is presented in Table 7 and
Figure 7. The outputs of the TI 59 calculator smoothing are shown
in Table 7 for the data segments centered at t = 2352 and at t = 2353. This
example is of some interest since the fitted cubic polynomial
changes drastically with the shift of one unit in the data segment
location. This is indicated by the coefficient $b_3$ of $t^3$, which
is positive when the segment is centered at t = 2352 and is
negative when the segment is centered at t = 2353 (see Section III
B 1). In spite of this difference in the fitting cubic
polynomials, the smoothed values do not differ drastically from
each other or from the temporary values initially used. Whether
the differences in the smoothed values shown in Table 7 are of
concern to potential users of the smoothed data is uncertain. If
they are not, then simultaneous smoothing can be used with either
missing point at the center of the data segment.

Smoothing of these points using simultaneous smoothing but
alternating the center between the missing points at successive
iterations has not been explored since only one smoothing step


Figure 7.   Adjacent Missing Points.


Table 7    Smoothing Adjacent Missing Points

    t        t'_i (1)      x_i        t'_i (2)

   2349        -3        33,015.3
   2350        -2        33,021.1       -3
   2351        -1        33,022.0       -2
   2352 M       0        33,024.0       -1
   2353 M      +1        33,026.1        0
   2354        +2        33,028.1        1
   2355        +3        33,033.0        2
   2356                  33,027.0        3

                  x(1)        x(2)

   SDR1          1.334       2.443
   SDR2          1.490       2.499
   SDR3           .736       1.904

   R                 3           3
   b3           0.1833     -0.2556
   b2           0.0143     -0.1405
   b1           1.2595      3.3432
   b0          -0.0571      0.9819

   x'(-3)     33,015.6    33,021.6
   x'(-2)     33,020.2    33,021.2
   x'(-1)     33,022.7    33,023.5
   x'(0)      33,024.2    33,026.9
   x'(+1)     33,025.6    33,029.7
   x'(+2)     33,028.2    33,030.6
   x'(+3)     33,033.0    33,027.9

   r(-3)         -0.27
   r(-2)          0.86
   r(-1)         -0.74
   r(0)          -0.17
   r(+1)          0.47
   r(+2)         -0.11
   r(+3)         -0.03

(1)  Segment center at t = 2352

(2)  Segment center at t = 2353


using either center brings residual errors for both replacement
values within the desired level ($|r_i| < 1$). Such alternation of
data segment centers could require substantial computational
effort using the TI 59 calculator and some increase in the program
and computational effort on a large computer.
III.   C 5   Two Missing Points Separated by a Single Point

When two missing points are separated by a single point,
the obvious choices are either (1) smoothing first one missing
point using the data segment centered on it, and then the other
missing point using the data segment centered on it together with the
smoothed value for the first missing point, or (2) smoothing both
values simultaneously using the data segment centered on the point
between them. This situation is illustrated in Table 8 and
Figure 8. The results of smoothing first for the missing point at
t = 2111 and, subsequently, for the missing point at t = 2113 are
shown in Table 8. The results of smoothing first at t = 2113 and
then at t = 2111 are also shown in Table 8. Smoothing both
missing points simultaneously produced the results shown in Table 8
(last two columns). The results are summarized below.


   Smoothing              Smoothed Values
   Procedure           t = 2111     t = 2113

   Temp. Values        33,889.8     33,870.4
   First at 2111       33,889.5     33,868.7
   First at 2113       33,889.2     33,870.5
   Simultaneous        33,887.9     33,870.3


Note that the greatest difference between the smoothed values is
less than 2 units. If this difference is not considered to be
serious then the simpler procedure of simultaneous smoothing could
be preferred.
Figure 8.   Separated Missing Points (2BX)
There is some concern about this procedure, however,
because of the large residual errors at t = 2112 and t = 2114.
This concern is also supported by the large values of the SDR's
(the Standard Deviations of the Residual Errors). On examination
of Figure 8 it can be seen that there are two possible
explanations of the large values of the SDR's. The first is that the
actual vehicle track is inadequately represented by a cubic
polynomial (model error). The other is that the noise level in this
path segment is greater than normal. Which explanation is
correct cannot be determined from the data. Hopefully,
vehicular control information and maneuver capabilities
will be of use here.

Smoothing the value at t = 2112 should not be performed
until after smoothed values have been established for the missing
points, so that its observed value is included in establishing
their smoothed values. Only then should the value at t = 2112 be
smoothed using the smoothed values for the missing points.
III.   C 6   Missing Points Separated by Two Points

When missing points are separated by two observed
values, simultaneous smoothing appears questionable since,
whatever data segment is used, one of the missing points will not be
adjacent to the center of the data segment. The preferred
procedure would appear to be to smooth one of the missing points
first using the data segment centered at that missing point, then
do the same for the other point. If, when the second missing
point is smoothed, the residual error of the smoothed value for
the first missing point is large (arbitrarily, greater than unity),
it would seem reasonable to resmooth the first again.

An example where two missing points are separated by two
observed values occurs in the data for the second vehicle (2BX),
where there are missing points at t = 2073 and t = 2076. The data
and graph are presented in Table 9 and Figure 9. Smoothing first
for the missing point at t = 2073 produced the results shown in
Table 9. Since the residual error at t = 2076 is less than
unity, the temporary value at t = 2076 was not subsequently
smoothed using the smoothed value at t = 2073 as suggested above.
Instead, the value at t = 2076 was smoothed using the temporary
value for the missing point at t = 2073. Again, both residual
errors were within the specified limit of unity. It would appear
that the suggested procedure of smoothing first one missing point
and then the other, including the smoothed value for the first, is
not always necessary. That it was not in this example is no
guarantee, however, that it may not be desirable in other cases.
Table 9.   Missing Points Separated by Two Points (2BX)


t . 

1 

x  . 

l 

begment 

Center 

2073 

2076 

2074 

2075 

2070 

34  ,000  .6 

2071 

33,997.3 

! 

2072 

33,989.8 

20  7  3  M 

2074 
2075 
20  7  6  M 
2077 

33,991.0 
33,992.  1 
33,986.3 
33,986. b 
J>3  ,^86.8 

2078 

33,984.8 

2079 

33,978.7 

SDRl 

2.591 

1  ,85b 

2.42b 

1  .759 

SDR2 

2.517 

1  ,b8l 

2  .478 

1.914 

SDR3 

2.624 

1  .856 

2.795 

1.912 

H 

2 

2 

1 

1     1 

b3 

— 

— 

— 

— 

b2 

0.3131 

-0.2655 

-- 

-- 

bl 

-2.2036 

-0.975 

-17524 

-17032] 

bo 

-1.2524 

-1.0619 

3  3,990.0 

33,988.2 

1 
X-3 

34,000. 1 

33  ,9b>2.0 

33, 994. b 

3  3,991.3 

1 
X-2 

33  ,996.4 

33,989.7 

33,993.0 

33,990.3 

1 
X-l 

33,993.2 

33,y87.9 

33,991.5 

33,989.2 

1 

xo 

33  ,990.7 

33 ,986.7 

3  3,998.0 

33  ,988.2 

i 
Xl 

1 
X2 

3  3,988.8 
33, 987. b 

3  3,986.0 

33  ,  ^85.8 

33,988.5 
33, 98b. 9 

3  3,987.2 

33,^8b.  1 

1 
X3 

33,986.9 

33  ,98b.  1 

33  ,985.4 

33  ,985.  1 

r-3 

0.47 

-  1  .  0  0 

2.75 

-1.50 

L'-2 

0  .  94 

2.41 

-J. 23 

0.7  4 

r-l 

-3.42 

-1.62 

-0.  bl 

2.87 

ro 

0.30 

-0.0  8 

2.11 

-1.9 

rl 

3.29 

0.8  3 

-2.16 

0  .  57 

r2 
r3 

-1.25 

-0.99 

-0.39 

0.66 

-0.31 

! 

0.4  5 

1  .38 

-0.  30 
As a further exploration of this example, simultaneous
smoothing for the missing values at t = 2073 and t = 2076 was
performed using data segments centered at t = 2074 and at t = 2075.
The results are also shown in Table 9. In this example, a data
segment centered at any one of the four points appears to be
acceptable for establishing smoothed values for the missing points.
Subsequent smoothing should, however, still be performed for the
values at t = 2074 and 2075.

Although the SDR's are reasonably small for all choices
of data segments, it is of some interest to compare the graph in
Figure 9 with the one in the previous section (Figure 8). The
scales on the y-axis are different but there appears to be some
element of doubt about the actual path here also. Note that the
smoothing procedure used second-order polynomials to fit the
segments centered at t = 2073 and 2076 but used first-order
polynomials to fit the data segments centered at t = 2074 and 2075.
III D.   TREATMENT OF MORE THAN TWO MISSING POINTS AND/OR POTENTIAL
OUTLIERS
1.   General Discussion

When more than three questionable values, either
missing points or potential outliers, are present in a 7-point data segment,
estimated values cannot be established by smoothing with a cubic
equation. When there are three questionable values, they can be
treated by either (1) iterated simultaneous smoothing or (2)
establishing the cubic equation that fits the remaining four points
in the segment exactly and then using that cubic equation to
determine values for the three questionable points. When the same
7-point segment is used, the smoothing treatment (1) should converge
to the exact fit (2). An example of this situation is explored in
Section III D 2.

Similarly, if there are four questionable values in a given
7-point data segment, the remaining three observations can be fitted
exactly by a second order polynomial (parabola), or iterated
simultaneous smoothing can be used to fit the parabola. Also, if there
are five questionable values, the remaining two observations can
be used to fit a first-order polynomial (a straight line) to these
observations by either method.

It should be noted that the critical number of observations in
a 7-point data segment required for fitting a polynomial of order
k is k+1, since there are k+1 coefficients in the polynomial. If
there are fewer than k+1 observations available then the polynomial
cannot be established uniquely. If there are k+1 observations, it
can either be fitted exactly or approximated by simultaneous
iterated smoothing. If there are more than k+1 observations then
only smoothing is appropriate.
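
The counting rule above can be written as a small helper. This is only an illustrative sketch; the function name `exact_fit_order` is hypothetical and the error handling is an assumption, not part of the report's algorithm.

```python
def exact_fit_order(n_questionable, segment_length=7):
    """Order k of the polynomial fixed exactly by the remaining k+1 observations."""
    remaining = segment_length - n_questionable
    if remaining < 2:
        # Fewer than two observations cannot determine even a straight line.
        raise ValueError("too few observations remain in the segment")
    return remaining - 1


for q in (3, 4, 5):
    print(q, "questionable values -> exact fit of order", exact_fit_order(q))
```

Three, four, and five questionable values leave four, three, and two observations, which determine a cubic, a parabola, and a straight line respectively, as described in the preceding paragraphs.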

It is also important to note that when a k-th order polynomial
is fitted exactly to k+1 observations the standard deviation of the
residual errors (SDR) is zero. In essence, the noise component is
absorbed in the fitted polynomial and no estimate of the magnitude
of the noise is possible. This absorption of the noise component
into the target path is in contrast to situations (Section III B 3,
for example) where polynomials of order three or less provide
inadequate representations of the vehicle's path and hence part of the
path variation is treated as noise. This results in larger
standard deviations of the residual errors (SDR). It is worthy of
emphasis, again, that a large value for SDR could be caused by
either a large noise component or inadequacy of the polynomial
model to represent the actual target path. It is important to
determine which cause is pertinent. Potential sources for this
information are internal control data for vehicular maneuvers and
examination of plots of the vehicle path. The latter would be
difficult to incorporate into a data smoothing algorithm for automatic
data processing (some human interaction may be necessary). The use
of internal control data appears to be a better approach if the
goal of complete automation is to be achieved.
III D   2.   THREE QUESTIONABLE VALUES

The problem with three questionable values in a 7-point data
segment will be illustrated by the z-component of vehicle A,
where there are potential outliers at t = 2125 and t = 2127, and
a missing value at t = 2128. The values of the $z_i$'s in a
region containing possible 7-point data segments are plotted in Figure
10 and listed in Table 10a. The fourth-order differences (Δ4) are
listed in the third column.

Selection of the appropriate 7-point data segment is the first
consideration. Centering it at t = 2125 would place the missing
value at t = 2128 at the end of the segment and would not appear as
desirable as centering it at t = 2126 or at t = 2127 to include the
value on the other side of the missing point in the segment.
Initially, it was decided to center the segment on the time between
the potential outliers (t = 2126) so both potential outliers would
be adjacent to the segment center and the missing point would not
be an end point.

The 7-point L-S Polynomial Smoothing program was used to
perform simultaneous iterative smoothing of the three questionable
values, with the results shown in Table 10b and the fourth column in
Table 10a. Eight iterations were required to bring the residual
errors of all three values within the prescribed level ($|r_i| < 1.0$).
Fourth-order sequential differences were recalculated and are shown
in column 5 of Table 10a. Note that no potential outliers are now
indicated although the value of Δ4 at t = 2129 is close to the
selected threshold of 50.
Figure 10.   Three Questionable Values (2AZ)
The four observations in this segment that were not considered
questionable were next fitted by a cubic polynomial

$$ z^*(t') = b_0 + b_1 t' + b_2 t'^2 + b_3 t'^3 $$

where

   t      2123   2124   2125   2126   2127   2128   2129
   t'      -3     -2     -1      0     +1     +2     +3

so that

   t'_i      -3        -2         0        +3
   z_i     -426.3    -428.3    -465.3    -557.7
The derivation of the cubic equation fitting these four points exactly
gives (Table 10c)

$$ z^*(t') = -465.3 - 26.46\,t' - 2.96667\,t'^2 + 0.506667\,t'^3 . $$

Estimates of the values at the times of the questionable values
(t' = -1, +1, +2) were established using this equation and are
presented in column 6 of Table 10a. Sequential differences were
recalculated and the fourth order differences are presented in column 7 of
Table 10a.
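
The exact-fit estimates can be checked without solving for the coefficients explicitly. The sketch below uses Lagrange interpolation, which is a substitute technique chosen here for brevity and is not the elimination procedure of Table 10c; both yield the same unique cubic through the four unquestioned observations. The function name `lagrange_value` is hypothetical.

```python
def lagrange_value(points, t):
    """Evaluate at t the unique polynomial through points [(t_i, z_i), ...]."""
    total = 0.0
    for i, (ti, zi) in enumerate(points):
        weight = 1.0
        for j, (tj, _) in enumerate(points):
            if j != i:
                weight *= (t - tj) / (ti - tj)
        total += zi * weight
    return total


# Unquestioned observations at t' = -3, -2, 0, +3 (t = 2123, 2124, 2126, 2129).
good = [(-3, -426.3), (-2, -428.3), (0, -465.3), (3, -557.7)]

# Estimates at the questionable times t' = -1, +1, +2 (t = 2125, 2127, 2128).
estimates = {t: round(lagrange_value(good, t), 1) for t in (-1, 1, 2)}
print(estimates)
```

The resulting estimates agree with the values obtained from the cubic equation above when it is evaluated at t' = -1, +1, and +2.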

Comparison of the values in columns 2, 4, and 6 indicates the
following:

(a)  The observed values at t = 2125 and t = 2127 are
inconsistent with the rest of the observations (at t = 2125,
$z_i - \hat{z}_i = 17.3$ and $z_i - z^*_i = 17.9$, and at t = 2127,
$z_i - \hat{z}_i = 19.9$ and $z_i - z^*_i = 24.4$), so that both
potential outliers should be reclassified as actual outliers.

(b)  The smoothed values, $\hat{z}(t')$, are fairly close to the
estimates $z^*(t')$ after eight iterations. More iterations should
bring them still closer, but the iterations were stopped when the
residual errors were reduced to less than unity at all three
suspect times.

The fourth order differences in column 7 of Table 10a indicate
that there is a new potential outlier at time t = 2129. On
reference to the graph (Fig. 10), it appears that the observation at
this time is not necessarily an outlier but that there is a change
in the path of the vehicle which cannot be adequately approximated
by a cubic polynomial beyond this point.
Table 10c.   Exact Solution:   Segment center at t = 2126

     t         t'        z_i       z*(t')

   2123        -3      -426.3     -426.3
   2124        -2      -428.3     -428.3
   2125 W      -1                 -442.3
   2126         0      -465.3     -465.3
   2127 W      +1                 -494.2
   2128 M      +2                 -526.0
   2129        +3      -557.7     -557.7

Fit cubic   z*(t') = b0 + b1 t' + b2 t'^2 + b3 t'^3   at   t' = -3, -2, 0, +3

   (1)     z*(-3) = b0 - 3b1 + 9b2 - 27b3 = -426.3
   (2)     z*(-2) = b0 - 2b1 + 4b2 -  8b3 = -428.3
   (3)     z*( 0) = b0                    = -465.3
   (4)     z*(+3) = b0 + 3b1 + 9b2 + 27b3 = -557.7

Substitute b0 (3) in (1), (2), (4)

   (1')    b1 - 3b2 + 9b3 = -13.0
   (2')    b1 - 2b2 + 4b3 = -18.5
   (4')    b1 + 3b2 + 9b3 = -30.8

Solve (1') for b1

   (1")    b1 = -13.0 + 3b2 - 9b3

Substitute b1 (1") in (2'), (4')

   (2")    b2 - 5b3 = -5.5
   (4")    b2 = -2.96667

Substitute b2 (4") in (2")

   (2"')   b3 = 0.506667

Substitute b2 (4") and b3 (2"') in (1")

           b1 = -26.46
As an exploratory exercise, the exact solution using the data
segment centered at t = 2127 was also established. The results are
presented in columns 8 and 9 in Table 10a. It is interesting to
note that the observed value at t = 2129 does not appear as a
potential outlier in the recalculated fourth order sequential
differences. Neither does the observation at t = 2130. Since
subsequent fourth order differences are not affected, there is no
potential outlier remaining when this data segment is used.
IV.   Conclusions and Recommendations

Questionable data values, either potential outliers or
temporary values for missing points, degrade the quality of smoothed
estimates of points on a vehicular path. A position location system
which omits observations at every eighth observational time
(scheduled missing points) makes the treatment of other
questionable values more difficult and, if the latter are frequent, can
even preclude the use of smoothing.

Although potential outliers are treated the same way as missing
points in smoothing a specific data segment, they can produce
greater contamination of the smoothing process and should be given
priority in any smoothing algorithm. Also, on replacement of a
potential outlier by a smoothed value, sequential differences
should be recalculated to determine whether other potential
outliers occur in its vicinity. It is important, wherever possible,
to establish whether a potential outlier is actually an outlier
(a wild observational value) or is an indicator of a change in a
vehicular path that cannot be adequately represented by a
polynomial of order three or less. Automation of this identification
of the cause for a potential outlier may be facilitated by other
sources of information on changes in vehicular paths, such as
internal control data. An alternative source of this information
is manual observation of a plot of the observed data points to
establish points at which the vehicular path has changed so that
it cannot be expected to be represented by a polynomial of order
three or less. (The latter reduces the extent to which automation
can be achieved, and hence the incorporation of internal control
data into the smoothing process is preferred.)
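
The recalculation of sequential differences after a replacement can
be illustrated with a short Python sketch. It is not the report's
program. The threshold of 50 is the one stated in the Introduction;
the function names, the centered indexing of each fourth order
difference, and the synthetic cubic path are assumptions made for
illustration. For brevity the wild value is replaced by the known
true value rather than by a smoothed value.

```python
def fourth_differences(values):
    """Return centered fourth order sequential differences.

    Entry i is y[i-2] - 4*y[i-1] + 6*y[i] - 4*y[i+1] + y[i+2]; the two
    points at each end have no centered difference and are set to None.
    """
    n = len(values)
    diffs = [None] * n
    for i in range(2, n - 2):
        diffs[i] = (values[i - 2] - 4 * values[i - 1] + 6 * values[i]
                    - 4 * values[i + 1] + values[i + 2])
    return diffs


def potential_outliers(values, threshold=50):
    """Indices whose fourth difference has magnitude >= threshold."""
    diffs = fourth_differences(values)
    return [i for i, d in enumerate(diffs)
            if d is not None and abs(d) >= threshold]


# Exact cubic path with one wild value of +20 added at index 5.
path = [t ** 3 - 2 * t for t in range(10)]
observed = list(path)
observed[5] += 20

flagged = potential_outliers(observed)
# A single wild value e perturbs five differences by e, -4e, 6e, -4e, e,
# so its neighbours can be flagged too; the largest magnitude marks it.
worst = max(flagged, key=lambda i: abs(fourth_differences(observed)[i]))

# After replacing the wild value, the differences are recalculated.
observed[worst] = path[worst]
remaining = potential_outliers(observed)
```

In this synthetic case the neighbours of the wild value are flagged
only because of it, which is why the differences are recalculated
before any further value is treated.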

Isolated questionable values cause little problem since they
can be treated simply by iterated smoothing to establish
replacement estimated values consistent with the other observations in
the 7-point data segment centered at the time of the questionable
value. The presence of more than one questionable value requires
more complex treatment. The occurrence of two or three such values
requires different treatments, and these were discussed separately. If
more than three questionable values occur in a 7-point data segment,
the 7-point least squares smoothing procedure is not applicable.
(Polynomials of order one or two could still be considered,
depending on the number of questionable values, but should be avoided
since their ability to represent actual vehicular paths is
questionable.) Such data segments should be identified for both
potential users of the smoothed data and data collectors.
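
The iterated treatment of an isolated questionable value can be
sketched as follows. This is an illustrative Python sketch under
stated assumptions, not the report's program. The weights
(-2, 3, 6, 7, 6, 3, -2)/21 are the standard center-point weights of a
7-point least-squares fit of a polynomial of order three or less; the
function names, the tolerance, and the stopping rule are hypothetical.

```python
# Centre-point weights of a 7-point least-squares cubic fit (sum = 21).
WEIGHTS = (-2, 3, 6, 7, 6, 3, -2)


def smooth_center(segment):
    """Least-squares cubic estimate at the centre of a 7-point segment."""
    return sum(w * y for w, y in zip(WEIGHTS, segment)) / 21.0


def replace_isolated(segment, tolerance=1e-6, max_iterations=100):
    """Iterate the centre smoothing until the replacement value settles.

    `segment` holds 7 values with the questionable one at index 3.
    Returns the replacement value and the number of iterations used.
    """
    if len(segment) != 7:
        raise ValueError("expected a 7-point data segment")
    work = list(segment)
    for iteration in range(1, max_iterations + 1):
        estimate = smooth_center(work)
        change = abs(estimate - work[3])
        work[3] = estimate
        if change < tolerance:
            return estimate, iteration
    raise RuntimeError("replacement value did not converge")


# Cubic path sampled at offsets -3..3, with a temporary centre value.
true_path = [0.5 * t ** 3 - 3.0 * t ** 2 - 26.0 * t + 100.0
             for t in range(-3, 4)]
segment = list(true_path)
segment[3] = 0.0  # temporary value standing in for a missing point

value, iterations = replace_isolated(segment)
```

Because the center value carries weight 7/21 in its own estimate, each
pass of this sketch removes two thirds of the remaining error, and the
value settles on the one consistent with the other six observations.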

When there are two questionable values close to each other,
both their nature (missing point or potential outlier) and their time
separation need to be considered in establishing the appropriate
treatment. The following cases and their treatments appear
reasonable:

a.   Adjacent questionable values.

(1)  If the two questionable values consist of a potential
outlier and a missing point, then the two should be
smoothed simultaneously using the data segment
centered at the time of the potential outlier.

(2)  If two adjacent questionable values are both missing
points, then they should also be smoothed
simultaneously using the data segment centered at the time
of one of them. (The choice of center may affect the
resulting smoothed values but no general rule for
preference can be given.)

(3)  Situations in which adjacent questionable values are
both potential outliers appear to be unlikely and so
are not considered.

b.   Two questionable values separated by a single observation.

For simplicity of the smoothing algorithm and
reduction in computation, the two values
should be smoothed simultaneously using the data segment
centered at the observation time between the two
questionable values.

c.   Two questionable values separated by more than one
observation.

Since at least one of the questionable values cannot
be adjacent to the 7-point segment center, it would be
reasonable to smooth first one, then the other, returning
to the first for resmoothing. Priority of smoothing is
for potential outliers and, if both are potential
outliers, the first smoothed should be the one with the
largest fourth order sequential difference.
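
Cases a, b and c above can be collected into a single selection rule.
The Python sketch below is one possible reading of those cases, not
the report's algorithm. The function name, the "W"/"M" codes (taken
from the data-list notation), and the return format are hypothetical.
Two choices are assumptions not settled by the text: for two adjacent
missing points the earlier one is taken as center, and for two missing
points in case c the earlier one is smoothed first.

```python
def plan_two_values(i, kind_i, j, kind_j, d4_i=0.0, d4_j=0.0):
    """Suggest a treatment for two questionable values in a 7-point segment.

    i, j    -- observation indices of the two values
    kind_*  -- "W" for a potential outlier, "M" for a missing point
    d4_*    -- fourth order sequential differences (used for W vs W)
    Returns (mode, centers): mode is "simultaneous" or "sequential";
    centers lists the segment centre(s) in the order they are smoothed.
    """
    if i > j:  # order the pair so that i is the earlier observation
        i, kind_i, d4_i, j, kind_j, d4_j = j, kind_j, d4_j, i, kind_i, d4_i
    gap = j - i
    if gap < 1:
        raise ValueError("the two indices must differ")
    if gap == 1:  # case a: adjacent values
        if kind_i == kind_j == "W":
            raise ValueError("adjacent potential outliers are not treated")
        if "W" in (kind_i, kind_j):
            center = i if kind_i == "W" else j  # centre on the outlier
        else:
            center = i  # both missing: either centre (earlier: assumption)
        return "simultaneous", [center]
    if gap == 2:  # case b: one observation between them
        return "simultaneous", [i + 1]
    # case c: smooth one, then the other, then resmooth the first
    if kind_i == kind_j == "W":
        first, second = (i, j) if abs(d4_i) >= abs(d4_j) else (j, i)
    elif kind_j == "W":
        first, second = j, i
    else:
        first, second = i, j  # outlier at i, or both missing (assumption)
    return "sequential", [first, second, first]


plan_a = plan_two_values(10, "M", 11, "W")
plan_b = plan_two_values(10, "M", 12, "W")
plan_c = plan_two_values(10, "W", 14, "W", d4_i=60.0, d4_j=-95.0)
```

The sketch does not check that both values actually fall within one
7-point segment; a caller would need to confirm that separately.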

Situations involving three questionable values could
be smoothed simultaneously using a data segment centered so
that all three are as close to the center of the segment
as possible. A substantial number of iterations may be
required to bring the three residual errors to within the
specified level. It would appear preferable here to omit
smoothing and to fit the remaining four points in the data
segment using simultaneous linear equations to determine
the coefficients of the cubic equation that fits these four
points exactly. (It would be possible to use smoothing
limiting the polynomial to order two or less but, again,
the question of adequate representation of the target
path arises.) Whether simultaneous smoothing or the exact
fit is used, the procedure, in essence, treats the noise
components of the four observations as part of the
vehicular path instead of noise. Thus a reduction in the
quality of the estimates is introduced, and this
information should be indicated to both potential users and data
collectors.
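
The exact fit through the four remaining points can be sketched in
Python as follows. This is an illustration, not the report's
procedure: it uses the Lagrange form of the interpolating cubic in
place of solving the simultaneous linear equations (both give the same
polynomial). The function names, the offsets chosen, and the synthetic
path are hypothetical.

```python
def exact_cubic(points):
    """Return a function evaluating the cubic through four (t, y) points.

    Uses the Lagrange form; solving the four simultaneous linear
    equations for the coefficients gives the same polynomial.
    """
    if len(points) != 4:
        raise ValueError("exactly four points are required")
    if len({t for t, _ in points}) != 4:
        raise ValueError("the four times must be distinct")

    def evaluate(t):
        total = 0.0
        for k, (tk, yk) in enumerate(points):
            term = yk
            for m, (tm, _) in enumerate(points):
                if m != k:
                    term *= (t - tm) / (tk - tm)
            total += term
        return total

    return evaluate


def path(t):
    """Synthetic noise-free cubic path used only for this illustration."""
    return 0.5 * t ** 3 - 3.0 * t ** 2 - 26.0 * t + 100.0


# Hypothetical 7-point segment at offsets -3..3 from its centre; the
# values at offsets -1, 1 and 2 are questionable and are left out.
good_offsets = [-3, -2, 0, 3]
cubic = exact_cubic([(t, path(t)) for t in good_offsets])
replacements = {t: cubic(t) for t in (-1, 1, 2)}
```

With noise-free inputs the replacements reproduce the path exactly.
With real observations the noise in the four retained points passes
directly into the replacements, which is the loss of quality noted
above.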

The material presented in this report has emphasized
details which should be useful in understanding the
smoothing process and in implementing an appropriate
program for smoothing 3-D data at NUWES. It also provides
essential background for an investigation, which is to follow,
of the quality of 3-D data and for the establishment of
Figures of Merit for 3-D data submitted for smoothing.